Ranking Documents in Thesaurus-Based Boolean Retrieval Systems
Authors
Abstract
In this paper we investigate document ranking methods in thesaurus-based Boolean retrieval systems and propose a new thesaurus-based ranking algorithm called the Extended Relevance (E-Relevance) algorithm. The E-Relevance algorithm integrates the extended Boolean model and the thesaurus-based relevance algorithm. Because the E-Relevance algorithm has all the desirable properties of the extended Boolean model, it avoids the various problems of previous thesaurus-based ranking algorithms. The E-Relevance algorithm also ranks documents effectively by using term dependence information from the thesaurus. Performance comparisons show that the proposed algorithm achieves higher retrieval effectiveness than the algorithms proposed earlier.
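As background for the abstract above, the following is a minimal sketch of the p-norm scoring functions, assuming that "the extended Boolean model" refers to the p-norm model of Salton, Fox and Wu. It is not the E-Relevance algorithm, whose details the abstract does not give, and it uses no thesaurus information. The function names `pnorm_or` and `pnorm_and`, the default `p = 2`, and the use of equal query-term weights are illustrative assumptions.

```python
from typing import Sequence


def pnorm_or(weights: Sequence[float], p: float = 2.0) -> float:
    """Score a document against an OR query (hypothetical helper).

    `weights` holds the document's index-term weights in [0, 1], one per
    query term. Query terms are assumed to be equally weighted.
    """
    if not weights:
        raise ValueError("at least one term weight is required")
    n = len(weights)
    return (sum(w ** p for w in weights) / n) ** (1.0 / p)


def pnorm_and(weights: Sequence[float], p: float = 2.0) -> float:
    """Score a document against an AND query (hypothetical helper).

    The score is one minus the normalized distance from the ideal point
    where every query term has weight 1.
    """
    if not weights:
        raise ValueError("at least one term weight is required")
    n = len(weights)
    return 1.0 - (sum((1.0 - w) ** p for w in weights) / n) ** (1.0 / p)


if __name__ == "__main__":
    doc = [1.0, 0.0]  # first query term fully present, second absent
    print(pnorm_or(doc), pnorm_and(doc))
```

With `p = 1` both functions reduce to the mean of the term weights, and as `p` grows the scores move toward strict Boolean behavior (the maximum weight for OR, the minimum for AND). A document matching only some of the query terms therefore still receives a graded score instead of a flat accept or reject.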
Similar resources
Norbert Fuhr: Information Retrieval Methods for Literary Texts
Information retrieval focuses on content-based searching in text documents. For this purpose, first text content must be represented, by using a representation language (like thesauri or classification schemes) or by performing free-text search. The latter approach uses either string-based or computer-linguistic methods (stemming, dictionary lookup, syntax analysis). For retrieval, weighting an...
Semantic-based Medical Records Retrieval via Medical-context Aware Query Expansion and Ranking
Efficient retrieval of medical records involves contextual understanding of both the query and the records' contents. This enhances search effectiveness beyond mere keyword matching and is assisted by analyzing their semantics, for example by using the MeSH thesaurus. The query is annotated and expanded with information from the deep medical contextual understanding. This...
Partial Boolean Algebras as Models for Thesaurus Integration
A model of a collection of documents based on partial Boolean algebras is presented. This model was considered while analysing a problem of thesaurus integration. Some properties of partial Boolean algebras are exploited in defining theoretical tools for information retrieval associated with this model. Such tools are a logical language representing queries to the system and a browsing mec...
Document Ranking Method for High Precision Rate
Many information retrieval (IR) systems retrieve relevant documents based on exact matching of keywords between a query and documents. This method degrades the precision rate. In order to solve the problem, we collected semantically related words and manually assigned the semantic relationships used in a general thesaurus and a special relationship called keyfact term (FT). In addition to the semantic know...
Fujitsu Laboratories TREC7 Report
In our first participation in TREC, our focus was on improving the basic ranking systems and applying text clustering techniques for query expansion. We tested a variety of techniques including reference measures, passage retrieval, and data fusion for the basic ranking systems. Some techniques were used in the official run; others were not used because of time limitations. We applied the text cl...
Journal title:
- Inf. Process. Manage.
Volume 30, Issue
Pages -
Publication date 1994